Abstract. Various real-life networks of current interest are simultaneously scale-free
and modular. Here we study analytically the average distance in a class of
deterministically growing scale-free modular networks. By virtue of the recursive
relations derived from the self-similar structure of the networks, we compute this
important quantity rigorously, obtaining an explicit closed-form solution, which recovers
the previous result and is corroborated by extensive numerical calculations. The exact
expression shows that the average distance scales logarithmically with the number
of nodes in the networks, indicating the existence of small-world behavior. We show
that this small-world phenomenon comes from the peculiar architecture of the network
family.

1. Introduction

Average distance is one of the most important measurements characterizing complex
networks, a subject that has attracted a lot of interest in the recent physics literature
[1, 2, 3, 4]. Extensive empirical studies have shown that many, perhaps most, real networks
exhibit a remarkable small-world phenomenon [5], with their average distance growing as a
logarithmic function of network order (i.e., the number of nodes in a network), or more
slowly [1, 2]. As a fundamental topological property, average distance is closely related
to other structural characteristics, such as degree distribution [6, 7], centrality [8],
fractality [9, 10, 11], symmetry [12], and so forth. All these features together play
significant roles in characterizing and understanding the complexity of networks.
Moreover, average distance is relevant to various dynamical processes occurring on
complex networks, including epidemic spreading [5], target search [13],
synchronization [14], random walks [15, 16, 17], and many more.

In addition to the small-world behavior, two other prominent properties that seem
to be common to real networks, especially biological and social networks, are the
scale-free feature [18] and modular structure [19, 20, 21]. The former implies that the
networks obey a power-law degree distribution $P(k) \sim k^{-\gamma}$ with
$2 < \gamma \leq 3$, while the latter means that the networks can be divided into groups
(modules), within which nodes are more tightly connected with each other than with nodes
outside. In order to describe the two striking properties simultaneously, Ravasz and
Barabási (RB) presented a famous model [21] mimicking scale-free modular networks. Many
topological properties of, and dynamical processes on, the RB model have been
investigated in much detail, including the degree distribution [21], clustering
coefficient [21, 22], betweenness centrality distribution [22], community structure [23],
and random walks [24, 25], among others. In particular, by mapping the networks onto a
Potts model in one-dimensional lattices, Noh proved that the RB model is small-world [22].

In this paper, we study the average distance in the RB model by using an alternative
approach very different from the previous one [22]. Our computation method is based
on the particular deterministic construction of the RB model. Concretely, making use of
the self-similar structure of the scale-free modular networks, we establish some recursion
relations, from which we further derive the exact analytical solution for the average
distance. The rigorous expression we obtain is compatible with the previous formula. We
show that the RB model is small-world. We also show that the small-world behavior
is a natural result of the scale-free and modular architecture of the networks under
consideration.

2. The modular scale-free networks 

We first introduce the RB model for scale-free modular networks, which are built in
an iterative way [20, 21]. Let $H_g$ stand for the network model after $g$ ($g \geq 1$)
iterations (i.e., number of generations). Initially ($g = 1$), the model is composed of
$m$ ($m \geq 3$) nodes linked by $m(m-1)/2$ edges forming a complete graph, among which
one node (e.g., the central node in figure 1) is called the hub (or root) node, and the
other $m-1$ nodes are named peripheral nodes. At the second generation ($g = 2$), $m-1$
replicas of $H_1$ are created, with the $m-1$ peripheral nodes of each replica being
connected to the root of the original $H_1$. In this way, we obtain $H_2$, the hub and
peripheral nodes of which are the hub of the original $H_1$ and the $(m-1)^2$ peripheral
nodes in the $m-1$ duplicates of $H_1$, respectively. Suppose one has $H_{g-1}$; the
next-generation network $H_g$ can be obtained by adding $m-1$ copies of $H_{g-1}$ to the
primal $H_{g-1}$, with all peripheral nodes of the replicas being linked to the hub of the
original $H_{g-1}$ unit. The hub of the original $H_{g-1}$ and the peripheral nodes of the
$m-1$ copies of $H_{g-1}$ form the hub node and peripheral nodes of $H_g$, respectively.
Repeating indefinitely the two steps of replication and connection, one obtains the
scale-free modular networks. Figure 1 illustrates a network $H_3$ for the particular case
of $m = 4$.

Figure 1. (Color online) Sketch of a network $H_3$ for the case of $m = 4$. The filled
squares and circles represent the hub node and peripheral nodes, respectively.
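
The replication-and-connection rule can be made concrete with a short Python sketch. The function name `build_rb`, the integer node labels, and the choice of node 0 as the hub are illustrative assumptions and not part of the model definition.

```python
def build_rb(m, g):
    """Return (adj, hub, peripheral) for the network H_g, with m >= 3 and g >= 1.

    adj maps each node label to the set of its neighbours.
    Node 0 is used as the hub (an arbitrary labelling choice).
    """
    # H_1: complete graph on m nodes; node 0 is the hub, the rest are peripheral.
    adj = {i: {j for j in range(m) if j != i} for i in range(m)}
    hub, peripheral = 0, set(range(1, m))
    for _ in range(g - 1):
        n = len(adj)
        new_adj = {v: set(nb) for v, nb in adj.items()}   # the original unit
        new_peripheral = set()
        for c in range(1, m):                              # the m - 1 replicas
            shift = c * n
            for v, nb in adj.items():
                new_adj[v + shift] = {u + shift for u in nb}
            for p in peripheral:
                # Link every peripheral node of the replica to the original hub.
                new_adj[p + shift].add(hub)
                new_adj[hub].add(p + shift)
                new_peripheral.add(p + shift)
        adj, peripheral = new_adj, new_peripheral
    return adj, hub, peripheral


if __name__ == "__main__":
    adj, hub, peripheral = build_rb(4, 3)   # the case drawn in figure 1
    print(len(adj), len(peripheral), len(adj[hub]))
```

For $m = 4$ and $g = 3$ the sketch produces $4^3 = 64$ nodes, of which $3^3 = 27$ are peripheral, in line with the construction described above.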

Many interesting quantities of the model can be determined explicitly [21, 22]. In
$H_g$, the network order, denoted by $N_g$, is $N_g = m^g$; the degree
$K_h(g) = \frac{m-1}{m-2}\left[(m-1)^g - 1\right]$ of the hub node is the largest among all
nodes; the number of peripheral nodes, forming a set $\mathcal{P}_g$, is
$P_g = (m-1)^g$; and the average degree is approximately equal to a constant
$2(m-1)(3m-2)/m$ in the limit of infinite $g$, showing that the networks are sparse.
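
As a small consistency check, the closed form for the hub degree can be compared with a generation-by-generation count. The sketch below assumes, from the construction, that the hub receives $(m-1)^i$ links at generation $i$; the function names are illustrative.

```python
def network_order(m, g):
    """N_g = m^g."""
    return m ** g


def peripheral_count(m, g):
    """P_g = (m - 1)^g."""
    return (m - 1) ** g


def hub_degree(m, g):
    """Closed form K_h(g) = (m-1)/(m-2) * [(m-1)^g - 1]; the division is exact."""
    return (m - 1) * ((m - 1) ** g - 1) // (m - 2)


def hub_degree_by_generation(m, g):
    """Assumed reading of the construction: (m-1)^i hub links added at generation i."""
    return sum((m - 1) ** i for i in range(1, g + 1))


if __name__ == "__main__":
    for g in range(1, 6):
        print(g, network_order(4, g), peripheral_count(4, g), hub_degree(4, g))
```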

The model under consideration is in fact an extension of the one proposed in [26]
and studied in much detail in [27, 28, 29]. It presents some typical features observed in a
variety of real-world systems [21, 22]. Its degree distribution follows a power-law scaling
$P(k) \sim k^{-\gamma}$ with a general exponent $\gamma = 1 + \ln m/\ln(m-1)$ belonging to
the interval $(2, 2.585]$. Its average clustering coefficient tends to a large constant
dependent on $m$, and its average distance grows logarithmically with the network order,
both of which show that the model is small-world. In addition, the betweenness
distribution $P_B$ of nodes also obeys a power-law behavior $P_B \sim B^{-2}$, with an
exponent independent of the parameter $m$. In particular, the whole class of networks
shows a remarkable modular structure. These peculiar structural properties make the
networks unique within the category of complex networks.
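
The dependence of the degree exponent on $m$ is easy to tabulate. The following sketch simply evaluates $\gamma = 1 + \ln m/\ln(m-1)$; the function name `degree_exponent` is an illustrative choice.

```python
import math


def degree_exponent(m):
    """gamma = 1 + ln(m) / ln(m - 1), defined for m >= 3."""
    if m < 3:
        raise ValueError("the model requires m >= 3")
    return 1.0 + math.log(m) / math.log(m - 1)


if __name__ == "__main__":
    for m in (3, 4, 5, 10, 100, 1000):
        print(m, round(degree_exponent(m), 4))
```

The largest value, about 2.585, is reached at $m = 3$, and the exponent decreases towards 2 as $m$ grows.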



3. Explicit formula for average distance 

As shown in the introduction section, average distance is closely related to many
topological properties of, and various dynamical processes on, complex networks. In
what follows, we will derive analytically the average distance of the scale-free modular
networks by applying an alternative method completely different from that in [22]. We
represent all the shortest path lengths of network $H_g$ as a matrix in which the entry
$d_{ij}(g)$ is the distance between nodes $i$ and $j$, that is, the length of a shortest
path joining $i$ and $j$. A measure of the typical separation between two nodes in $H_g$
is given by the average distance $d_g$, defined as the mean of distances over all pairs of
nodes:

$$ d_g = \frac{D_g}{N_g (N_g - 1)/2}, \tag{1} $$

where

$$ D_g = \sum_{i \in H_g,\, j \in H_g,\, i \neq j} d_{ij}(g) \tag{2} $$

denotes the sum of the distances between two nodes over all pairs. Notice that in
Eq. (2), for a pair of nodes $i$ and $j$ ($i \neq j$), we only count $d_{ij}(g)$ or
$d_{ji}(g)$, not both.
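
For a general connected, unweighted graph the two definitions can be evaluated directly by breadth-first search. The sketch below assumes the graph is given as a dictionary mapping each node to an iterable of neighbours; the function names are illustrative.

```python
from collections import deque


def total_distance(adj):
    """Sum of shortest-path lengths over unordered node pairs, as in Eq. (2).

    Assumes the graph is connected and undirected.
    """
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2   # every unordered pair was counted from both endpoints


def average_distance(adj):
    """Mean distance over all unordered pairs, as in Eq. (1)."""
    n = len(adj)
    return total_distance(adj) / (n * (n - 1) / 2)


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}   # three nodes in a line
    print(total_distance(path), average_distance(path))
```

The halving in `total_distance` implements the convention that each pair is counted once.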

We continue by exhibiting the procedure of determining the total distance and
present the recurrence formula, which allows us to obtain $D_{g+1}$ of generation $g+1$
from $D_g$ of generation $g$. The studied network $H_g$ has a self-similar structure that
allows one to calculate $D_g$ analytically. By construction (see figure 2), network
$H_{g+1}$ is obtained by joining $m$ copies of $H_g$ that are labeled as
$H_g^{(1)}, H_g^{(2)}, \ldots, H_g^{(m)}$. Using this self-similar property, the total
distance $D_{g+1}$ satisfies the recursion relation

$$ D_{g+1} = m D_g + \Delta_g, \tag{3} $$

where $\Delta_g$ is the sum over all shortest path lengths whose endpoints are not in the
same $H_g$ branch. The paths that contribute to $\Delta_g$ must all go through the hub
node $X$, where the $m$ copies of $H_g$ are connected. Hence, to determine $D_g$, all that
is left is to calculate $\Delta_g$. The analytic expression for $\Delta_g$, referred to as
the crossing path length [30, 31], can be derived as below.
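
Iterating the recursion relation (3) down to the first generation shows explicitly that the crossing path lengths are the only unknown ingredients. Taking $D_1 = m(m-1)/2$, the total distance of the initial complete graph, one finds:

```latex
\begin{aligned}
D_g &= m\,D_{g-1} + \Delta_{g-1} \\
    &= m^{2} D_{g-2} + m\,\Delta_{g-2} + \Delta_{g-1} \\
    &= \cdots \\
    &= m^{g-1} D_1 + \sum_{i=1}^{g-1} m^{\,g-1-i}\,\Delta_i ,
\qquad D_1 = \frac{m(m-1)}{2}.
\end{aligned}
```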

Figure 2. (Color online) Schematic illustration of the means of construction of the
scale-free modular networks. $H_{g+1}$ is obtained by joining $m$ replicas of $H_g$ denoted
as $H_g^{(\varphi)}$ ($\varphi = 1, 2, \ldots, m$), which are connected to one another by
linking all the peripheral nodes of $H_g^{(x)}$ ($2 \leq x \leq m$) to the hub node
(denoted by $X$) of $H_g^{(1)}$.

Let $\Delta_g^{(\alpha,\beta)}$ be the sum of the lengths of all shortest paths whose
endpoints are in $H_g^{(\alpha)}$ and $H_g^{(\beta)}$, respectively. According to whether
the two branches are one link or two links apart, we split the crossing paths
$\Delta_g^{(\alpha,\beta)}$ into two categories: the first category consists of crossing
paths $\Delta_g^{(1,\theta)}$ ($2 \leq \theta \leq m$), while the second category consists
of crossing paths $\Delta_g^{(\theta_1,\theta_2)}$ with $2 \leq \theta_1 \leq m$,
$2 \leq \theta_2 \leq m$, and $\theta_1 \neq \theta_2$. It is easy to see that
the numbers of the two categories of crossing paths are $m-1$ and $(m-1)(m-2)/2$,
respectively. Moreover, any two crossing paths in the same category have the same
length. Thus, the total sum $\Delta_g$ is given by

$$ \Delta_g = (m-1)\,\Delta_g^{(1,2)} + \frac{(m-1)(m-2)}{2}\,\Delta_g^{(2,3)}. \tag{4} $$

Having $\Delta_g$ in terms of the quantities $\Delta_g^{(1,2)}$ and $\Delta_g^{(2,3)}$,
the next step is to explicitly determine the two quantities.
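
A quick way to confirm that the two categories exhaust all crossing paths is to add their counts, which must equal the number of unordered pairs of branches:

```latex
(m-1) + \frac{(m-1)(m-2)}{2}
  = \frac{(m-1)\,[\,2 + (m-2)\,]}{2}
  = \frac{m(m-1)}{2}
  = \binom{m}{2}.
```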

To calculate the crossing distances $\Delta_g^{(1,2)}$ and $\Delta_g^{(2,3)}$, we
introduce the following notation. For an arbitrary node $v$ in network $H_g$, let
$f_v(g)$ be the smallest value of the shortest path length from $v$ to any of the
$(m-1)^g$ peripheral nodes belonging to $\mathcal{P}_g$, and let the sum of $f_v(g)$ over
all nodes in $H_g$ be denoted by $F_g$. Analogously, in $H_g$ let $h_v(g)$ denote
the distance from a node $v$ to the hub node $X$, and let $M_g$ stand for the total
distance between all nodes in $H_g$ and the hub node $X$ in $H_g$, including $X$ itself.
By definition, $F_{g+1}$ can be given by the sum

$$
\begin{aligned}
F_{g+1} &= \sum_{v \in H_g^{(1)}} f_v(g+1) + \sum_{\varphi=2}^{m} \sum_{v \in H_g^{(\varphi)}} f_v(g+1) \\
        &= \sum_{v \in H_g} \left[h_v(g) + 1\right] + (m-1) \sum_{v \in H_g} f_v(g) \\
        &= (m-1)F_g + N_g + M_g,
\end{aligned}
\tag{5}
$$
and $M_{g+1}$ can be written recursively as

$$
\begin{aligned}
M_{g+1} &= \sum_{v \in H_g^{(1)}} h_v(g+1) + \sum_{\varphi=2}^{m} \sum_{v \in H_g^{(\varphi)}} h_v(g+1) \\
        &= \sum_{v \in H_g} h_v(g) + (m-1) \sum_{v \in H_g} \left[f_v(g) + 1\right] \\
        &= M_g + (m-1)(F_g + N_g).
\end{aligned}
\tag{6}
$$

Using $N_g = m^g$, and considering $F_1 = 1$ and $M_1 = m-1$, the simultaneous equations (5)
and (6) can be solved inductively to obtain

$$ F_g = m^{g-2}\left[(2g-1)m - 2g + 2\right] \tag{7} $$

and

$$ M_g = m^{g-2}(m-1)(m + 2g - 2). \tag{8} $$
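
The closed forms (7) and (8) can be checked against a direct iteration of the coupled recursions (5) and (6). The sketch below uses exact rational arithmetic because the prefactor $m^{g-2}$ is fractional at $g = 1$; the function names are illustrative.

```python
from fractions import Fraction


def closed_forms(m, g):
    """F_g and M_g from Eqs. (7) and (8), as exact rationals."""
    scale = Fraction(m) ** (g - 2)
    F = scale * ((2 * g - 1) * m - 2 * g + 2)
    M = scale * (m - 1) * (m + 2 * g - 2)
    return F, M


def iterate_recursions(m, g):
    """F_g and M_g from Eqs. (5) and (6), starting at F_1 = 1, M_1 = m - 1."""
    N, F, M = m, 1, m - 1
    for _ in range(g - 1):
        # Both right-hand sides use the values of the previous generation.
        F, M = (m - 1) * F + N + M, M + (m - 1) * (F + N)
        N *= m
    return F, M


if __name__ == "__main__":
    for g in range(1, 6):
        print(g, iterate_recursions(3, g), closed_forms(3, g))
```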

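With the results obtained above, we can determine $\Delta_g^{(1,2)}$ and
$\Delta_g^{(2,3)}$, which can be expressed in terms of these explicitly determined
quantities. By definition, $\Delta_g^{(1,2)}$ is given by the sum

$$
\begin{aligned}
\Delta_g^{(1,2)} &= \sum_{u \in H_g^{(1)},\, v \in H_g^{(2)}} d_{uv}(g+1) \\
                 &= \sum_{u \in H_g^{(1)},\, v \in H_g^{(2)}} \left[h_u(g) + 1 + f_v(g)\right] \\
                 &= \sum_{v \in H_g^{(2)}} \sum_{u \in H_g^{(1)}} h_u(g) + \sum_{u \in H_g^{(1)}} \sum_{v \in H_g^{(2)}} \left[1 + f_v(g)\right] \\
                 &= N_g M_g + (N_g)^2 + N_g F_g.
\end{aligned}
\tag{9}
$$

Inserting Eqs. (7) and (8) into (9), we have

$$ \Delta_g^{(1,2)} = 2\,m^{2g-2}\left[m^2 + 2(g-1)m - 2g + 2\right]. \tag{10} $$

Proceeding similarly,

$$
\begin{aligned}
\Delta_g^{(2,3)} &= \sum_{u \in H_g^{(2)},\, v \in H_g^{(3)}} d_{uv}(g+1) \\
                 &= 2\left[(N_g)^2 + N_g F_g\right] \\
                 &= 2\,m^{2g-2}\left[m^2 + (2g-1)m - 2g + 2\right].
\end{aligned}
\tag{11}
$$

Substituting Eqs. (10) and (11) into (4), we get

$$ \Delta_g = m^{2g-1}(m-1)^2(m + 2g). \tag{12} $$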
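
As a worked example, take $m = 3$ and $g = 1$, so that $N_1 = 3$, $M_1 = 2$, $F_1 = 1$ and $D_1 = 3$. Using only the last line of Eq. (9), the second line of Eq. (11), and Eqs. (4) and (3), the crossing path lengths and the total distance of $H_2$ follow by direct arithmetic:

```latex
\begin{aligned}
\Delta_1^{(1,2)} &= N_1 M_1 + N_1^2 + N_1 F_1 = 6 + 9 + 3 = 18, \\
\Delta_1^{(2,3)} &= 2\left(N_1^2 + N_1 F_1\right) = 2\,(9 + 3) = 24, \\
\Delta_1 &= (m-1)\,\Delta_1^{(1,2)} + \frac{(m-1)(m-2)}{2}\,\Delta_1^{(2,3)}
          = 2 \cdot 18 + 1 \cdot 24 = 60, \\
D_2 &= m\,D_1 + \Delta_1 = 3 \cdot 3 + 60 = 69 .
\end{aligned}
```

The value $D_2 = 69$ can also be confirmed by listing the 36 node pairs of the nine-node network $H_2$ by hand.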

Substituting Eq. (12) into (3) and using the initial value $D_1 = m(m-1)/2$, we can
obtain the exact expression for the total distance

$$ D_g = \frac{1}{2}\,m^{g-2}\left[4m + m^2 - m^3 + 2m^{g+2} - 4g\,m^{g} + (4g-6)\,m^{g+1}\right]. \tag{13} $$
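
The closed form (13) can be cross-checked by iterating Eq. (3) directly, building the crossing path length at each generation from Eq. (4) together with the last line of Eq. (9) and the second line of Eq. (11). The sketch below is one way to do this with exact arithmetic; the function names are illustrative.

```python
from fractions import Fraction


def total_distance_closed(m, g):
    """D_g from Eq. (13), evaluated exactly."""
    bracket = (4 * m + m**2 - m**3 + 2 * m**(g + 2)
               - 4 * g * m**g + (4 * g - 6) * m**(g + 1))
    return Fraction(m) ** (g - 2) * bracket / 2


def total_distance_iterated(m, g):
    """D_g from Eq. (3), with the crossing term built from Eqs. (4), (9), (11)."""
    N, F, M = m, 1, m - 1            # N_1, F_1, M_1
    D = m * (m - 1) // 2             # D_1: complete graph on m nodes
    for _ in range(g - 1):
        d12 = N * M + N * N + N * F                  # last line of Eq. (9)
        d23 = 2 * (N * N + N * F)                    # second line of Eq. (11)
        delta = (m - 1) * d12 + (m - 1) * (m - 2) // 2 * d23   # Eq. (4)
        D = m * D + delta                            # Eq. (3)
        F, M = (m - 1) * F + N + M, M + (m - 1) * (F + N)      # Eqs. (5), (6)
        N *= m
    return D


if __name__ == "__main__":
    for g in range(1, 6):
        print(g, total_distance_iterated(3, g), total_distance_closed(3, g))
```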

Figure 3. Average distance $d_g$ versus network order $N_g$ on a semi-logarithmic scale.
The solid lines are guides to the eye.



The expression provided by Eq. (13) is consistent with the result previously
obtained [22]. The analytic expression for the average distance then follows as

$$ d_g = \frac{4m + m^2 - m^3 + 2m^{g+2} - 4g\,m^{g} + (4g-6)\,m^{g+1}}{m^2\,(m^g - 1)}. \tag{14} $$

We have also checked our rigorous result provided by Eq. (14) against numerical 
calculations for different m and various g. In all the cases we obtain a complete 
agreement between our theoretical formula and the results of numerical investigation, 
see figure 3. 
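
A numerical check of this kind can be reproduced for small networks with the following self-contained sketch, which builds $H_g$, measures the average distance by breadth-first search, and compares it with Eq. (14). It is not the authors' code: the node labelling (hub at index 0) and all function names are assumptions made for illustration.

```python
from collections import deque
from fractions import Fraction


def rb_network(m, g):
    """Adjacency lists of H_g; node 0 is the hub (a labelling assumption)."""
    adj = [[j for j in range(m) if j != i] for i in range(m)]
    peripheral = list(range(1, m))
    for _ in range(g - 1):
        old, n = adj, len(adj)
        adj = [list(nb) for nb in old]               # the original unit
        new_peripheral = []
        for c in range(1, m):                        # the m - 1 replicas
            adj.extend([u + c * n for u in nb] for nb in old)
            for p in peripheral:                     # replica periphery -> hub
                adj[p + c * n].append(0)
                adj[0].append(p + c * n)
                new_peripheral.append(p + c * n)
        peripheral = new_peripheral
    return adj


def bfs_average_distance(adj):
    """Exact mean shortest-path length over all node pairs."""
    n, total = len(adj), 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist)
    # total runs over ordered pairs, so divide by n(n - 1).
    return Fraction(total, n * (n - 1))


def formula_average_distance(m, g):
    """Eq. (14), evaluated exactly."""
    num = (4 * m + m**2 - m**3 + 2 * m**(g + 2)
           - 4 * g * m**g + (4 * g - 6) * m**(g + 1))
    return Fraction(num, m**2 * (m**g - 1))


if __name__ == "__main__":
    for m, g in [(3, 2), (3, 4), (4, 3), (5, 3)]:
        print(m, g, bfs_average_distance(rb_network(m, g)),
              formula_average_distance(m, g))
```

Because the search visits every node from every source, the cost grows roughly as the square of the network order, so the sketch is only practical for the first few generations.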

We continue to express the average distance $d_g$ as a function of network order $N_g$,
in order to obtain the scaling between these two quantities. Recalling that $N_g = m^g$,
we have $g = \log_m N_g$. Hence Eq. (14) can be rewritten as

$$
\begin{aligned}
d_g &= \frac{4m + m^2 - m^3 + 4(m-1)\,N_g \log_m N_g + (2m^2 - 6m)\,N_g}{m^2\,(N_g - 1)} \\
    &= \frac{4(m-1)\,N_g \ln N_g + (2m^2 - 6m)\,N_g \ln m + (4m + m^2 - m^3)\ln m}{m^2 \ln m\,(N_g - 1)}.
\end{aligned}
\tag{15}
$$

In the infinite network order limit, i.e., $N_g \to \infty$,

$$ d_g \simeq \frac{4(m-1)}{m^2 \ln m}\,\ln N_g + 2 - \frac{6}{m}. \tag{16} $$

Thus, for large networks, the leading behavior of the average distance grows
logarithmically with increasing network order.
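
How quickly the exact result approaches its leading behavior can be seen numerically. The sketch below evaluates Eq. (14) and the large-network form $\frac{4(m-1)}{m^2 \ln m}\ln N_g + 2 - \frac{6}{m}$ of Eq. (16) side by side; the function names are illustrative.

```python
import math
from fractions import Fraction


def exact_average_distance(m, g):
    """Eq. (14), computed exactly and converted to float."""
    num = (4 * m + m**2 - m**3 + 2 * m**(g + 2)
           - 4 * g * m**g + (4 * g - 6) * m**(g + 1))
    return float(Fraction(num, m**2 * (m**g - 1)))


def leading_average_distance(m, g):
    """Large-network form of Eq. (16), with ln N_g = g ln m."""
    log_n = g * math.log(m)
    return 4 * (m - 1) / (m**2 * math.log(m)) * log_n + 2 - 6 / m


if __name__ == "__main__":
    for g in (2, 5, 10, 20):
        exact = exact_average_distance(3, g)
        leading = leading_average_distance(3, g)
        print(g, exact, leading, exact - leading)
```

The gap between the two expressions shrinks rapidly with $g$, since the neglected terms are suppressed by a factor of order $1/N_g$.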

The small-world phenomenon observed above, namely that the leading behavior of the
average distance is a logarithmic function of network order, can be accounted for by the
following heuristic argument based on the peculiar architecture of the networks. At first
sight, this family of modular networks is not a very compact system, since in these
networks nodes with large degrees are not directly linked to one another, but are
connected to nodes with small degree. However, this network family is made up of many
small, densely interconnected clusters, which combine to form larger but less compact
groups connected by nodes with relatively high degrees. For node pairs in a small group,
the shortest path length is very small because of the high cohesiveness of small modules.
The shortest paths between two nodes belonging to different large groups might seem
long, because the groups in which the nodes lie are not adjacent to each other.
But this is not the case. By construction, although the relatively large groups are not
directly adjacent, they are joined by some large-degree nodes, which are connected to each
other by a layer of intermediate small-degree nodes (see figure 1), such as the peripheral
nodes or locally peripheral nodes [22]. Thus, different from conventional random
scale-free networks, especially assortative networks [32], in the studied scale-free
modular networks the large-degree nodes, although not connected to one another, play the
role of bridges linking different modules together, which is the main reason why the
average distance of the networks is small.

It is worth mentioning that, although the studied modular scale-free networks
display small-world behavior, the logarithmic scaling of the average distance with
network order differs from the sublogarithmic scaling of conventional non-modular
stochastic scale-free networks with degree distribution exponent $\gamma < 3$, in which
the average distance $d(N)$ exhibits a double-logarithmic scaling with network order $N$,
namely, $d(N) \sim \ln \ln N$ [6, 7]. Thus, although the degree distribution exponent
of the modular scale-free networks is smaller than 3, their average distance is larger
than that of their random counterparts with the same network order. The root of this
difference may also lie in the modular structure, particularly the indirect connection
of large-degree nodes, as addressed above. The genuine reasons for this dissimilarity
need further study.

4. Conclusions 

The determination and analysis of average distance are important for understanding the
complexity of, and dynamical processes on, complex networks, which have been a subject
of considerable interest within the physics community. In this paper, we investigated
analytically the average distance in a class of deterministically growing networks with
scale-free behavior and modular structure, two properties that coexist in a plethora
of real-life networks, such as social and biological networks. Based on the self-similar
structure of the networks, we derived a closed-form expression for the average distance.
The exact solution shows that very large networks of this class are small-world,
with their average distance increasing as a logarithmic function of network order. We
confirmed the rigorous solution by extensive numerical simulations. We also
showed that the small-world behavior originates from the inherent modularity and
scale-free property of the networks.



An alternative approach to determining average distance in a class of scale-free modular networks



Acknowledgment 

We would like to thank Xing Li for his support. This research was supported by 
the National Natural Science Foundation of China under Grants No. 60704044, No. 
60873040, and No. 60873070, the National Basic Research Program of China under 
Grant No. 2007CB310806, Shanghai Leading Academic Discipline Project No. B114, 
the Program for New Century Excellent Talents in University of China (Grants No. 
NCET-06-0376), and Shanghai Committee of Science and Technology (Grants No. 
08DZ2271800 and No. 09DZ2272800). 

References 

[1] R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
[2] S. N. Dorogovtsev and J. F. F. Mendes, Adv. Phys. 51, 1079 (2002).
[3] M. E. J. Newman, SIAM Rev. 45, 167 (2003).
[4] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, Phys. Rep. 424, 175 (2006).
[5] D. J. Watts and S. H. Strogatz, Nature (London) 393, 440 (1998).
[6] F. Chung and L. Lu, Proc. Natl. Acad. Sci. U.S.A. 99, 15879 (2002).
[7] R. Cohen and S. Havlin, Phys. Rev. Lett. 90, 058701 (2003).
[8] S. N. Dorogovtsev, J. F. F. Mendes, and J. G. Oliveira, Phys. Rev. E 73, 056122 (2006).
[9] C. Song, S. Havlin, and H. A. Makse, Nature Phys. 2, 275 (2006).
[10] Z. Z. Zhang, S. G. Zhou, and T. Zou, Eur. Phys. J. B 56, 259 (2007).
[11] Z. Z. Zhang, S. G. Zhou, L. C. Chen, and J. H. Guan, Eur. Phys. J. B 64, 277 (2008).
[12] Y. Xiao, B. D. MacArthur, H. Wang, M. Xiong, and W. Wang, Phys. Rev. E 78, 046102 (2008).
[13] L. A. Adamic, R. M. Lukose, A. R. Puniyani, and B. A. Huberman, Phys. Rev. E 64, 046135 (2001).
[14] T. Nishikawa, A. E. Motter, Y.-C. Lai, and F. C. Hoppensteadt, Phys. Rev. Lett. 91, 014101 (2003).
[15] S. Condamin, O. Bénichou, V. Tejedor, R. Voituriez, and J. Klafter, Nature (London) 450, 77 (2007).
[16] Z. Z. Zhang, Y. Lin, S. G. Zhou, B. Wu, and J. H. Guan, New J. Phys. 11, 103043 (2009).
[17] Z. Z. Zhang, Y. Qi, S. G. Zhou, S. Y. Gao, and J. H. Guan, Phys. Rev. E (in press).
[18] A.-L. Barabási and R. Albert, Science 286, 509 (1999).
[19] M. Girvan and M. E. J. Newman, Proc. Natl. Acad. Sci. U.S.A. 99, 7821 (2002).
[20] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabási, Science 297, 1551 (2002).
[21] E. Ravasz and A.-L. Barabási, Phys. Rev. E 67, 026112 (2003).
[22] J. D. Noh, Phys. Rev. E 67, 045103(R) (2003).
[23] H. J. Zhou, Phys. Rev. E 67, 061901 (2003).
[24] J. D. Noh and H. Rieger, Phys. Rev. E 69, 036111 (2004).
[25] Z. Z. Zhang, Y. Lin, S. Y. Gao, S. G. Zhou, J. H. Guan, and M. Li, Phys. Rev. E (in press).
[26] A.-L. Barabási, E. Ravasz, and T. Vicsek, Physica A 299, 559 (2001).
[27] K. Iguchi and H. Yamada, Phys. Rev. E 71, 036144 (2005).
[28] E. Agliari and R. Burioni, Phys. Rev. E 80, 031125 (2009).
[29] Z. Z. Zhang, Y. Lin, S. Y. Gao, S. G. Zhou, and J. H. Guan, J. Stat. Mech. (2009) P10022.
[30] M. Hinczewski and A. N. Berker, Phys. Rev. E 73, 066126 (2006).
[31] Z. Z. Zhang, J. H. Guan, B. L. Ding, L. C. Chen, and S. G. Zhou, New J. Phys. 11, 083007 (2009).
[32] M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002).



